Abstract

Define the scaled empirical point process on an independent and identically
distributed sequence $\{Y_i : i \le n\}$ as the random point measure
with masses at $a_n^{-1} Y_i$. For suitable $a_n$ we obtain the weak limit of these
point processes through a novel use of a dimension-free method based 
on the convergence of compensators of multiparameter martingales. The 
method extends previous results in several directions. We obtain limits at 
points where the density of Yi may be zero, but has regular variation. The 
joint limit of the empirical process evaluated at distinct points is given by 
independent Poisson processes. These results also hold for multivariate 
Yi with little additional effort. Applications are provided both to nearest- 
neighbour density estimation in high dimensions, and to the asymptotic 
behaviour of multivariate extremes such as those arising from bivariate 
normal copulas. 

Keywords: multiparameter martingales; point processes; density estimation; 
multivariate extremes; local empirical processes 
Running title: Empirical point processes 
1 Introduction



Point processes and their limits arise naturally in many areas of statistics, and 
have a number of applications ranging from survival analysis to spatial statis- 
tics. Point processes also arise in probability theory as limits for extreme value 
processes, in studying limits of sums of stable non-Gaussian variables and in 
queuing models. Of course the Poisson process is a fundamental concept in 
martingale theory. Weak convergence of the empirical point process underlies 
many applications, and this paper employs the relatively recent area of mul- 
tiparameter martingales to establish a novel and unified approach to proving 
such limits for scaled empirical point processes. Although various elegant and 
powerful methods have been developed for particular classes of problems, the 
generalized martingale approach provides an extremely simple, dimension-free 
method of addressing a variety of old and new distributional questions. 

Given a random sample of random vectors $\{Y_i : i \le n\}$ in $\mathbb{R}^d$ and a suitable
class of sets $\{A\}$, the empirical point process is defined by

$$N^{(n)}_A = \sum_{i=1}^{n} I\{Y_i \in A\}.$$

As noted above, the weak convergence of $N^{(n)}$ has been extensively studied
using a variety of methods. In particular, a strong approximation approach
can be used to establish weak convergence of the local empirical process (see
Einmahl, 1997, and the references therein):

$$L_{n,x}(t) = \frac{1}{n} \sum_{i=1}^{n} I\{Y_i \in [x - t a_n,\, x + t a_n]\}, \quad t \in [0,1],$$

where now the $Y_i$'s are univariate. If the sequence of constants, $a_n$, is appropriately
chosen then the limit process is homogeneous Poisson. However, this
strong Poisson approximation is difficult to implement (or at least cannot be
extended directly) if one wants to study the joint behaviour of

$$(L_{n,x_1}(\cdot), \ldots, L_{n,x_m}(\cdot)),$$

i.e. when estimating the density of $Y_1$ simultaneously at $(x_1, \ldots, x_m)$ (see Section 4.1).
Even in the Gaussian case, where simultaneous approximation by
independent Wiener processes is known, Deheuvels et al. (2000) point out that
a major technical difficulty arises in proving independence at separate $x_i$.

The aim of this paper is to develop a general and natural approach to weak 
Poisson limits for empirical point processes. It is based on the multiparameter 
martingale theory of Ivanoff and Merzbach (2000) and requires only the simple
computation of so-called *-compensators to identify Poisson limits for scaled 
empirical point processes. The compensator method exploited here is partic- 
ularly attractive in that it is independent of the dimension of the underlying 
random vectors, and so easily generalizes results from the univariate to the mul- 
tivariate case. In addition, the martingale approach allows one to handle the 
joint behaviour at multiple points with ease through a judicious definition of the
associated history (filtration). In particular, we shall show that the asymptotic
behaviour of the local empirical process at distinct points can be described
by independent Poisson processes, an intuitive but otherwise technically
challenging result.

The method has additional benefits. First, only (multivariate) regular variation
of the density $f$ of $Y_1$ is required, and the limits are explicitly written in
terms of $f$. Indeed, we can discover the appropriate scaling constants even when
$f$ is regularly varying but $f(0) = 0$, i.e. a case with inhomogeneous Poisson limits
excluded in Borisov (2000). Second, we can characterize the distributional behaviour of joint
extremes for different bivariate copulas. This recovers Einmahl (1997, Corollaries
2.4 and 2.5) where the limit Poisson process has a product mean measure,
but also extends to more complex cases (Corollary 4.5). In particular, we can
identify extreme value limits for copulas with asymptotically dependent multivariate
extremes more simply than methods employing multivariate regular
variation (cf. Resnick, 1987).

The paper is structured as follows. The next section will review key elements
of the theory of multiparameter martingales and, in particular, the use
of *-compensators in proving weak convergence of a sequence of point processes.
Section 3 defines the scaled empirical point process generated by a
sample and establishes point process limits for such processes. This proceeds in
steps from the classical non-negative and univariate case (yielding limits similar
to those for extreme value processes), to the multivariate and multidimensional
cases. In each case the proof simplifies to the straightforward calculation of
*-compensators, and highlights the universality of the martingale approach.
Section 4 on applications illustrates the utility of our results by establishing
for the first time weak limits for nearest-neighbour estimates of joint densities
(again at several points simultaneously), and by providing new extreme value
limits for multivariate copulas.

2 Notation and background: Point processes and 
martingale methods 

We provide a brief introduction to point processes and martingale methods in- 
dexed by general Euclidean spaces using the set-indexed framework introduced 
in Ivanoff and Merzbach (2000). We need definitions mimicking those for mar- 
tingales indexed by R+. 

Set $T = \mathbb{R}^d$ or $\mathbb{R}^d_+$, and $\mathcal{A} = \{A_t = [0,t] : t \in T\} \cup \{\emptyset\}$, where we interpret
$[0,t]$ in the obvious way if $t \notin \mathbb{R}^d_+$. Set-inclusion on $\mathcal{A}$ induces a partial order,
$\preceq$, on $T$: $s \preceq t$ if and only if $A_s \subseteq A_t$. This is not the usual partial order on $\mathbb{R}^d$:
e.g. $\{0\}$ is the (unique) minimal element, and all quadrants are equipped with
their own partial order. In particular, if $T = \mathbb{R}$, points with different signs are
incomparable. This special structure permits us to define a $2^d$-sided martingale
theory.
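
The partial order can be made concrete with a small sketch. This is an illustration only, not part of the paper: the function name `precedes` is hypothetical, and the check simply encodes that $A_s \subseteq A_t$ holds exactly when each coordinate of $s$ lies between 0 and the corresponding coordinate of $t$.

```python
from typing import Sequence


def precedes(s: Sequence[float], t: Sequence[float]) -> bool:
    """Return True when A_s = [0, s] is contained in A_t = [0, t]."""
    if len(s) != len(t):
        raise ValueError("s and t must have the same dimension")
    for si, ti in zip(s, t):
        # Coordinate-wise, si must lie between 0 and ti (inclusive):
        # opposite signs or a larger magnitude break the inclusion.
        if si * ti < 0 or abs(si) > abs(ti):
            return False
    return True


if __name__ == "__main__":
    print(precedes((0.5, -1.0), (1.0, -2.0)))  # same quadrant, comparable
    print(precedes((-1.0,), (2.0,)))           # different signs, incomparable
```

The origin precedes every point, and two points in different quadrants are never comparable, which is the structure behind the $2^d$-sided theory.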
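The semi-algebra $\mathcal{C}$ is the class of all subsets of $T$ of the form

$$C = A \setminus B, \quad A \in \mathcal{A},\ B \in \mathcal{A}(u),$$

where $\mathcal{A}(u)$ denotes the class of sets which are finite unions of sets from $\mathcal{A}$. Let
$(\Omega, \mathcal{F}, P)$ be any complete probability space. A filtration indexed by $\mathcal{A}(u)$ is
a class $\{\mathcal{F}_A : A \in \mathcal{A}(u)\}$ of complete sub-$\sigma$-fields of $\mathcal{F}$ where $\forall A, B \in \mathcal{A}(u)$,
$\mathcal{F}_A \subseteq \mathcal{F}_B$ if $A \subseteq B$, and (Monotone outer-continuity) $\mathcal{F}_{\cap_i A_i} = \cap_i \mathcal{F}_{A_i}$ for any
decreasing sequence $(A_i)$ in $\mathcal{A}(u)$ such that $\cap_i A_i \in \mathcal{A}(u)$. For consistency, we
define $\mathcal{F}_T = \mathcal{F}$. We may associate $\sigma$-algebras with sets in $\mathcal{C}$: for $C \in \mathcal{C} \setminus \mathcal{A}$,
let $\mathcal{G}^*_C = \bigvee_{B \in \mathcal{A}(u),\, B \cap C = \emptyset} \mathcal{F}_B$, and for $A \in \mathcal{A}$, $A \ne \emptyset$, define $\mathcal{G}^*_A = \mathcal{F}_\emptyset$. A
($\mathcal{A}$-indexed) stochastic process $X = \{X_A : A \in \mathcal{A}\}$ is a collection of random
variables indexed by $\mathcal{A}$, and is adapted if $X_A$ is $\mathcal{F}_A$-measurable for every $A \in \mathcal{A}$.
By convention, $X_\emptyset = 0$.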

A process $X : \mathcal{A} \to \mathbb{R}$ is increasing if for every $\omega \in \Omega$, the function $X_\cdot(\omega)$
can be extended to a finitely additive function on $\mathcal{C}$ satisfying $X_\emptyset(\omega) = 0$ and
$X_C(\omega) \ge 0$, $\forall C \in \mathcal{C}$, and such that if $(A_n)$ is a decreasing sequence of sets
in $\mathcal{A}(u)$ such that $\cap_n A_n \in \mathcal{A}(u)$, then $\lim_n X_{A_n}(\omega) = X_{\cap_n A_n}(\omega)$. A process
$N = \{N_A, A \in \mathcal{A}\}$ is a point process if it is an increasing process taking its
values in $\mathbb{N}$, and almost surely for any $t \in T$, $N_{\{t\}} = 0$ or 1. Note that if $N$ is
a point process on $T = \mathbb{R}$, then $N_t := N_{[0,t]}$ (for $t$ positive or negative) and not
$N_{(-\infty,t]}$. As expected, $N$ is a Poisson process on $T$ with mean measure $\Lambda$ if $N$ is
a point process where $N_C \sim \mathrm{Poisson}(\Lambda_C)$, $\forall C \in \mathcal{C}$, and whenever $C_1, \ldots, C_n \in \mathcal{C}$
are disjoint, $N_{C_1}, \ldots, N_{C_n}$ are independent. If $\Lambda$ is absolutely continuous with
respect to Lebesgue measure, its density $\lambda$ is called the intensity of the Poisson
process.

An integrable process $M = \{M_A, A \in \mathcal{A}\}$ is called a pseudo-strong martingale
if for any $C \in \mathcal{C}$, $E[M_C \mid \mathcal{G}^*_C] = 0$. The process $\bar X$ is a *-compensator of $X$
if it is increasing and the difference $X - \bar X$ is a pseudo-strong martingale. The
asymptotic behaviour of a sequence of point processes may be determined by
*-compensators as shown in the following theorem specializing Theorem 8.2.2 and
Corollary 8.2.3 of Ivanoff and Merzbach (2000) to multivariate point processes
on $T = \mathbb{R}^d$ or $\mathbb{R}^d_+$.

To state this theorem, we consider $k$ point processes $N(1), \ldots, N(k)$ all adapted
to a common $\mathcal{A}(u)$-indexed filtration $\{\mathcal{F}_A\}$ and so that with probability one,
none of the processes have a jump point in common. The $k$-variate point process
$\vec N$ is defined by $\vec N_A = (N_A(1), \ldots, N_A(k))$ and has ($k$-variate) *-compensator
$\vec\Lambda = (\Lambda(1), \ldots, \Lambda(k))$ if $\Lambda(i)$ is a *-compensator for $N(i)$ with respect to the common
filtration $\{\mathcal{F}_A\}$.

In what follows, "$\to_P$" denotes convergence in probability and "$\to_D$" denotes
convergence in both finite dimensional distribution and in distribution in
the Skorokhod topology if $T = \mathbb{R}^d_+$ (identifying $N_t^{(n)}$ (respectively, $N_t$) with $N_{A_t}^{(n)}$
(respectively, $N_{A_t}$)). We remark that the Skorokhod topology may be extended
to all of the quadrants in $\mathbb{R}^d$ on the space of "outer-continuous functions with
inner limits", and the convergence in the theorem below holds in this case as
well. In the sequel, convergence in the Skorokhod topology will be interpreted
in this way.

Theorem 2.1 Let $(\vec N^{(n)})$ be a sequence of $k$-variate point processes on $T$ adapted
to a filtration $\{\mathcal{F}_A\}$ and $(\vec\Lambda^{(n)})$ a sequence of corresponding *-compensators.
Suppose that for each $A \in \mathcal{A}$ and $i = 1, \ldots, k$ the sequences $(N_A^{(n)}(i))$ and
$(\Lambda_A^{(n)}(i))$ are uniformly integrable and that $\Lambda_A^{(n)}(i) \to_P \Lambda_A(i)$ where $\Lambda(i)$ is a
deterministic measure on $T$ absolutely continuous with respect to Lebesgue measure.
Then $\vec N^{(n)} \to_D \vec N$, where $\vec N = (N(1), \ldots, N(k))$ and $N(1), \ldots, N(k)$ are
independent Poisson processes with mean measures $\Lambda(1), \ldots, \Lambda(k)$, respectively.

Proof: The proof of this theorem is a straightforward generalization of the
techniques used in the proof of Theorem 8.2.2 in Ivanoff and Merzbach (2000)
along with an application of Watanabe's characterization of the $k$-variate Poisson
process on $\mathbb{R}_+$, see Bremaud (1981, Theorem T6).

We conclude this section by defining empirical point processes on $T$ and stating
their *-compensators. Let $Y$ be a $T$-valued random variable with continuous
distribution function $F$. The single jump point process $J = \{J_A = I\{Y \in A\} :
A \in \mathcal{A}\}$ has *-compensator

$$\bar J_A = \int_A I\{u \in A_Y\}\, (F(E_u))^{-1}\, dF(u)$$

with respect to its minimal filtration, where $E_t = \{t' \in T : t \preceq t'\}$ (cf. Ivanoff
and Merzbach (2000)). Now, suppose that $Y_1, \ldots, Y_n$ are i.i.d. with distribution
$F(t) = P(Y_i \le t)$ and let $\mathcal{F} = \bigvee_{i=1}^{n} \mathcal{F}^{(i)}$ where $\mathcal{F}^{(i)}$ is the minimal filtration
generated by the single jump process associated with $Y_i$. Then the empirical
point process defined by

$$N_A^{(n)} = \sum_{i=1}^{n} I\{Y_i \in A\}$$

has *-compensator $\Lambda^{(n)}$ where

$$\Lambda_A^{(n)} = \sum_{i=1}^{n} \int_A I\{u \in A_{Y_i}\}\, (F(E_u))^{-1}\, dF(u). \tag{1}$$



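Example 2.2 If $T = \mathbb{R}_+$ then (1) reads as follows: $A = A_t = [0,t]$ for some
$t > 0$, $\preceq$ is just $\le$, the standard ordering, $E_u = [u, \infty)$. By noting that $F(E_u) =
P(Y_i \ge u) =: \bar F(u)$ we have

$$\Lambda_t^{(n)} = \sum_{i=1}^{n} \int_0^t \frac{I\{Y_i \ge u\}}{\bar F(u)}\, dF(u). \tag{2}$$
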
Example 2.3 If $T = \mathbb{R}$ then $A_t = [0,t]$ or $A_t = [t,0]$ depending on the sign of
$t$. We have $s \preceq t$ if $0 \le s \le t$, or $t \le s \le 0$, where $\le$ is the standard order on $\mathbb{R}$.
Points with different signs are incomparable. The sets $E_u$ will be either $[u, \infty)$
or $(-\infty, u]$ depending on the sign of $u$. If $u > 0$ then as above $F(E_u) = \bar F(u)$,
otherwise, if $u < 0$, then $F(E_u) = F(u)$. Now the *-compensator is given by
(2) if $t > 0$ and, if $t < 0$, by

$$\Lambda_t^{(n)} = \sum_{i=1}^{n} \int_t^0 \frac{I\{Y_i \le u\}}{F(u)}\, dF(u).$$

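Example 2.4 Let $Y_i = (Y_{i1}, Y_{i2})$, $i = 1, \ldots, n$. If $T = \mathbb{R}^2_+$ then (1) reads as
follows: $A = A_t = [0,t_1] \times [0,t_2]$ for some $t = (t_1, t_2)$, $E_u = \{t : t_i \ge u_i,\ i =
1,2\}$, $u = (u_1, u_2)$. By noting that $F(E_u) = P(Y_{i1} \ge u_1, Y_{i2} \ge u_2) =: \bar F(u)$ we
have

$$\Lambda_t^{(n)} = \sum_{i=1}^{n} \int_{[0,t_1] \times [0,t_2]} \frac{I\{Y_{i1} \ge u_1,\, Y_{i2} \ge u_2\}}{\bar F(u)}\, dF(u).$$

To extend this example to $\mathbb{R}^2$ we proceed as in Example 2.3, treating each
quadrant separately.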
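
As a numerical sketch of the setting of Example 2.2 (an illustration, not the authors' code): on $\mathbb{R}_+$ the integral in (1) is the cumulative hazard of $F$ evaluated at $\min(t, Y_i)$. Assuming Uniform(0,1) data purely for convenience, it equals $-\log(1 - \min(t, Y_i))$. The function names below are hypothetical.

```python
import math
import random


def compensator_uniform(sample, t):
    """*-compensator at t of the empirical point process for Uniform(0,1) data.

    For F(u) = u on [0, 1] the integral of dF(u) / (1 - F(u)) over
    [0, min(t, y)] is -log(1 - min(t, y)).
    """
    if not 0.0 <= t < 1.0:
        raise ValueError("t must lie in [0, 1)")
    return sum(-math.log1p(-min(t, y)) for y in sample)


def average_compensator(n, t, reps, seed=0):
    """Monte Carlo average of the compensator over `reps` samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.random() for _ in range(n)]
        total += compensator_uniform(sample, t)
    return total / reps


if __name__ == "__main__":
    # The martingale property gives E[compensator] = E[N_t] = n * F(t) = 15 here.
    print(average_compensator(n=50, t=0.3, reps=500))
```

The average should sit close to $nF(t)$, which is the mean of the empirical point process on $[0,t]$.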



3 Poisson limits at quantiles 

3.1 Univariate case 

We can use the previous section to determine the limiting behaviour of empirical
point processes at quantiles. Consider a sequence $\{Y_n\}$ of i.i.d. real-valued
positive random variables with distribution $F$. Assume now that $F(0) = 0$ and
that $F$ is regularly varying at 0 with index $\alpha > 0$, i.e. for all $t > 0$,

$$\lim_{x \downarrow 0} \frac{F(xt)}{F(x)} = t^\alpha, \tag{4}$$

see e.g. Resnick (1987). This implies that for $x$ in a neighbourhood of 0,
$F(x) = \ell(x) x^\alpha$. Here and in the sequel $\ell$ is a slowly varying function at 0 or at
$\infty$ as required, and it can be different at each appearance.

Let $a_n$ be such that $F(a_n) = n^{-1}$. This ensures that $a_n \sim n^{-1/\alpha} \ell(n)$
for some function $\ell$ slowly varying at $\infty$. Henceforth, we write $c_n \sim d_n$ if
$\lim_{n \to \infty} c_n / d_n = 1$.

Since $a_n \to 0$ we have

$$n F(a_n t) = \frac{F(a_n t)}{F(a_n)} \to t^\alpha. \tag{5}$$
r \a n ) 

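Define

$$N^{(n)} = \sum_{i=1}^{n} \delta_{a_n^{-1} Y_i}.$$

We have by (1)

$$\Lambda_t^{(n)} = \sum_{i=1}^{n} \int_0^t \frac{I\{Y_i \ge a_n u\}}{\bar F(a_n u)}\, dF(a_n u). \tag{6}$$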

We first reprove the well-known result (see e.g. Resnick (1987, Proposition
3.21)) concerning Poisson limits for empirical point processes. An elegant argument
can be applied (see e.g. Borisov, 2001, and the references therein) where
the law of $N^{(n)}$ is approximated (in the total variation sense and for each $n$ separately)
by $\mathrm{Poi}(\nu_n)$, the Poisson random measures with $\nu_n(A) = n E(I\{n^{1/k} X_i \in A\})$,
where the $X_i$'s are uniform on an appropriately chosen ball. If $n$ is sufficiently
large, strong approximation methods yield the coupling of the empirical point
processes to a single Poisson random measure, and weak convergence follows.
However, here we illustrate the martingale approach since the proof easily generalizes
to the multivariate context and to establishing simultaneous limits at
interior quantiles of $F$.

Theorem 3.1 Assume that $F(0) = 0$ and (4) holds. Then the sequence $(N^{(n)})$
converges in distribution to $N$ in the Skorokhod topology on $D[0,\infty)$ where $N$
is a Poisson process with mean measure $\Lambda_t = t^\alpha$ (intensity $\lambda(t) = \alpha t^{\alpha-1}$).

Proof. Since $N_t^{(n)}$ is square integrable with bounded second moments (uniformly
in $n$), the conditions of Theorem 2.1 will be satisfied if it is shown that the
sequence $(\Lambda_t^{(n)})$ given by (6) converges in $L_2$ to $t^\alpha$.



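$$E[\Lambda_t^{(n)}] = E\left[\sum_{i=1}^{n} \int_0^t \frac{I\{Y_i \ge a_n u\}}{\bar F(a_n u)}\, dF(a_n u)\right]
= \sum_{i=1}^{n} \int_0^t \frac{P(Y_i \ge a_n u)}{\bar F(a_n u)}\, dF(a_n u)
= \sum_{i=1}^{n} \int_0^t dF(a_n u) = n F(a_n t) \to t^\alpha$$

by applying (5). Using the independence assumption,

$$\begin{aligned}
E[(\Lambda_t^{(n)})^2]
&= \int_0^t\!\!\int_0^t \sum_{i=1}^{n} E[I\{Y_i \ge a_n u,\, Y_i \ge a_n v\}]\, \frac{dF(a_n u)\, dF(a_n v)}{\bar F(a_n u)\, \bar F(a_n v)} \\
&\quad + 2 \int_0^t\!\!\int_0^t \sum_{i<j} E[I\{Y_i \ge a_n u,\, Y_j \ge a_n v\}]\, \frac{dF(a_n u)\, dF(a_n v)}{\bar F(a_n u)\, \bar F(a_n v)} \\
&= \int_0^t\!\!\int_0^t \sum_{i=1}^{n} P(Y_i \ge a_n (u \vee v))\, \frac{dF(a_n u)\, dF(a_n v)}{\bar F(a_n u)\, \bar F(a_n v)} \\
&\quad + n(n-1) \int_0^t\!\!\int_0^t \bar F(a_n u)\, \bar F(a_n v)\, \frac{dF(a_n u)\, dF(a_n v)}{\bar F(a_n u)\, \bar F(a_n v)} \\
&= n \int_0^t\!\!\int_0^t \frac{\bar F(a_n (u \vee v))}{\bar F(a_n u)\, \bar F(a_n v)}\, dF(a_n u)\, dF(a_n v) + n(n-1) (F(a_n t))^2 .
\end{aligned}$$

Now, the first term converges to 0, because

$$n \int_0^t\!\!\int_0^t \frac{\bar F(a_n (u \vee v))}{\bar F(a_n u)\, \bar F(a_n v)}\, dF(a_n u)\, dF(a_n v)
\le n \int_0^t\!\!\int_0^t \frac{dF(a_n u)\, dF(a_n v)}{\bar F(a_n u)}
\le n\, [F(a_n t)]^2\, [\bar F(a_n t)]^{-1}$$

and $\bar F(a_n t) \to 1$ and $n F^2(a_n t) \to 0$ as $n \to \infty$. The second term converges
to $t^{2\alpha}$ by (5), so that $E[(\Lambda_t^{(n)} - t^\alpha)^2] \to 0$. So, $(\Lambda_t^{(n)})$ converges in $L_2$ and
therefore in probability.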
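
The statement of Theorem 3.1 can be checked by simulation. The sketch below is an illustration under assumptions that are not in the paper: it takes $F(x) = x^\alpha$ on $[0,1]$, so that $a_n = n^{-1/\alpha}$ exactly, and all function names are hypothetical. The number of scaled points $a_n^{-1} Y_i$ in $[0,t]$ should then have mean and variance both close to $t^\alpha$, as for a Poisson variable.

```python
import random


def scaled_count(n, alpha, t, rng):
    """Number of scaled points Y_i / a_n in [0, t] when F(x) = x**alpha on [0, 1]."""
    a_n = n ** (-1.0 / alpha)          # solves F(a_n) = 1 / n
    threshold = a_n * t
    # Y = U**(1/alpha) has distribution function x**alpha on [0, 1].
    return sum(1 for _ in range(n) if rng.random() ** (1.0 / alpha) <= threshold)


def count_mean_and_variance(n, alpha, t, reps, seed=0):
    """Sample mean and variance of the count over `reps` independent samples."""
    rng = random.Random(seed)
    counts = [scaled_count(n, alpha, t, rng) for _ in range(reps)]
    mean = sum(counts) / reps
    var = sum((c - mean) ** 2 for c in counts) / (reps - 1)
    return mean, var


if __name__ == "__main__":
    # Poisson limit with mean measure t**alpha = 1.5**2 = 2.25.
    print(count_mean_and_variance(n=200, alpha=2.0, t=1.5, reps=1000))
```

For finite $n$ the count is Binomial$(n, t^\alpha / n)$, so the agreement of mean and variance is only approximate and improves as $n$ grows.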

We may extend Theorem 3.1 to the entire line. $P_F$ is the probability measure
associated with a distribution $F$. We say that $F$ is regularly varying on the right
(left) at $u$ with index $\alpha$ ($\beta$) if

$$\lim_{x \downarrow 0} \frac{P_F((u, u + xt])}{P_F((u, u + x])} = t^\alpha, \qquad
\lim_{x \downarrow 0} \frac{P_F((u + xt, u])}{P_F((u - x, u])} = |t|^\beta \tag{7}$$

for all $t > 0$ and $t < 0$, respectively. Clearly, if $F$ has support $(0, \infty)$ then for
$u = 0$ the above condition reduces to (4).

If $F$ fulfills (7), we shall choose $a_n$ and $b_n$ so that

$$F(a_n + u) - F(u) = n^{-1}, \qquad F(u) - F(-b_n + u) = n^{-1}. \tag{8}$$

Fix $q \in (0,1)$ and set $x_q = F^{-1}(q)$ and assume that $F$ fulfills (7) at $u = x_q$. Let

$$N^{(n)}(q) = \sum_{i=1}^{n} \delta_{a_n^{-1}[Y_i - x_q]}\, I[Y_i > x_q] + \sum_{i=1}^{n} \delta_{b_n^{-1}[Y_i - x_q]}\, I[Y_i < x_q]. \tag{9}$$



The argument in the preceding proof can now be repeated for $t > 0$ and $t < 0$
to prove that $\Lambda_t^{(n)}$ converges in $L_2$ to $t^\alpha$ if $t > 0$ and to $|t|^\beta$ if $t < 0$. Let $G$ be
the distribution of $Y_i - x_q$. Thus, $G(s) = F(s + x_q)$. To see that the norming
sequences $a_n$ and $b_n$ are chosen appropriately, using the same calculation as in
the proof of Theorem 3.1 we have for $t > 0$

$$E[\Lambda_t^{(n)}] = n \int_0^t dG(a_n u) = n[F(a_n t + x_q) - F(x_q)]
= \frac{F(a_n t + x_q) - F(x_q)}{F(a_n + x_q) - F(x_q)} \to t^\alpha$$

by the first part of (7). Moreover, bearing in mind Example 2.3 we have for
$t < 0$,

$$E[\Lambda_t^{(n)}] = n \int_t^0 dG(b_n u) = n[F(x_q) - F(b_n t + x_q)]
= \frac{F(x_q) - F(b_n t + x_q)}{F(x_q) - F(-b_n + x_q)} \to |t|^\beta$$

by the second part of (7). Theorem 2.1 leads to the following Corollary.
Corollary 3.2 Assume (7). Then $N^{(n)}(q) \to_D N$, where $N$ is a Poisson
process on $\mathbb{R}$ with intensity

$$\lambda(t) = \begin{cases} \alpha t^{\alpha - 1} & \text{if } t > 0, \\ \beta |t|^{\beta - 1} & \text{if } t < 0. \end{cases}$$

The power of the martingale method can be seen when one wants to obtain the
asymptotic joint distribution of several $N^{(n)}(q)$.

Theorem 3.3 Let $0 < q_1 < q_2 < \ldots < q_k < 1$ and assume that (7) holds for
each $x_{q_i}$, $i = 1, \ldots, k$, with $\alpha_i$ and $\beta_i$, respectively. Then

$$(N^{(n)}(q_1), N^{(n)}(q_2), \ldots, N^{(n)}(q_k)) \to_D (N(1), \ldots, N(k)),$$

where $(N(1), \ldots, N(k))$ is a $k$-variate Poisson process on $\mathbb{R}$ with independent
components and marginal intensities $\lambda(i)$, $i = 1, \ldots, k$, given by

$$\lambda_t(i) = \begin{cases} \alpha_i t^{\alpha_i - 1} & \text{if } t > 0, \\ \beta_i |t|^{\beta_i - 1} & \text{if } t < 0. \end{cases}$$

Proof: For clarity, we will consider only the case $k = 2$ and verify the conditions
of Theorem 2.1. The general result follows in a straightforward manner.
We begin by observing that it suffices to show joint convergence of

$$(N_t^{(n)}(q_1), N_t^{(n)}(q_2))$$

for all $t \in [-K, K]$ for any arbitrary finite constant $K$. Assume that $F$ is
regularly varying on the right and left of $x_{q_i}$ with index $\alpha_i$ and $\beta_i$, respectively,
$i = 1,2$. As before, define $a_n^{(i)}$ and $b_n^{(i)}$, $i = 1,2$ according to (8). Choose $M$ large
enough that for $n \ge M$, $[x_{q_1} - K b_n^{(1)}, x_{q_1} + K a_n^{(1)}]$ and $[x_{q_2} - K b_n^{(2)}, x_{q_2} + K a_n^{(2)}]$
do not intersect. This will ensure that those points $Y_j$ which are jump points of
$N^{(n)}(q_1)$ are not jump points of $N^{(n)}(q_2)$ and vice versa.

Consider for $i = 1, 2$, $1 \le j \le n$ the single jump point process

$$J^{(n,j)}(q_i) = \delta_{(a_n^{(i)})^{-1}[Y_j - x_{q_i}]}\, I[Y_j > x_{q_i}] + \delta_{(b_n^{(i)})^{-1}[Y_j - x_{q_i}]}\, I[Y_j < x_{q_i}].$$

It is adapted to $\mathcal{F} = \{\mathcal{F}_t : -K \le t \le K\}$, where

At I £r < I « 6 [* w+ *^,* w ]}'* = 1 ' 2 5J' = 1 '-' n > if ^°- 

We will compute a *-compensator $\bar J^{(n,j)}(q_1)$ of the single jump process $J^{(n,j)}(q_1)$.
We consider only $0 < t < K$ as the argument for $t < 0$ is similar. Let

$$U_t = [x_{q_1} - K b_n^{(1)}, x_{q_1} + t a_n^{(1)}] \cup [x_{q_2} - K b_n^{(2)}, x_{q_2} + t a_n^{(2)}].$$

Then for $C = (t, t'] \in \mathcal{C}$, it follows that $I\{Y_j \in U_t\}$ is $\mathcal{G}^*_C$-measurable and so heuristically, the
compensator $\bar J^{(n,j)}(q_1)$ satisfies

$$\bar J_{dt}^{(n,j)}(q_1) = \frac{I\{Y_j \notin U_t\}\, dF(x_{q_1} + a_n^{(1)} t)}{1 - F(U_t)}$$

provided that $[x_{q_1} - K b_n^{(1)}, x_{q_1} + K a_n^{(1)}]$ and $[x_{q_2} - K b_n^{(2)}, x_{q_2} + K a_n^{(2)}]$ are disjoint
intervals. Using arguments similar to those in Ivanoff and Merzbach (2000) it is
straightforward to verify that for $n \ge M$ the *-compensator $\Lambda^{(n)}(i)$ of $N^{(n)}(q_i)$
is

$$\Lambda_t^{(n)}(i) = \sum_{j=1}^{n} \int_0^t \frac{I\{Y_j \notin U_s\}\, dF(x_{q_i} + a_n^{(i)} s)}{1 - F(U_s)}. \tag{10}$$

Exactly as in the comments leading to Corollary 3.2 we have $E[\Lambda_t^{(n)}(i)] \to t^{\alpha_i}$
for the appropriate constant $\alpha_i$, since $F$ is regularly varying on the right at $x_{q_i}$.

The argument that $E[(\Lambda_t^{(n)}(i))^2] \to (t^{\alpha_i})^2$ is also similar to that used in
the proof of Theorem 3.1. Also, $N_t^{(n)}(q_i)$ is square integrable with bounded
second moments (uniformly in $n$). Thus the conditions of Theorem 2.1 have
been satisfied and the result follows.




3.2 Multivariate case 

Let $\{Y_n\}_{n \ge 1}$ be a sequence of i.i.d. $\mathbb{R}^d$-valued random variables with continuous
distribution $F$. Following the pattern of the previous section, we may obtain
a point process limit if the regular variation index at $u$ for $F$ depends on the
choice of orthant. To be precise, let $O_k$ be the $k$th orthant and $e_k$ its associated
unit vector, $k = 1, \ldots, 2^d$. Then $F$ is regularly varying at $u$ from orthant $u + O_k$,
with index $\alpha_k$ and rate $W_k$ if for $t \in O_k$

$$\lim_{x \downarrow 0} \frac{P_F((u, xt + u])}{P_F((u, x e_k + u])} = W_k(t). \tag{11}$$

The function $W_k$ is homogeneous of order $\alpha_k$, i.e. $W_k(st) = s^{\alpha_k} W_k(t)$, see e.g.
Resnick (1987).

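We define $N^{(n)}$ in analogy to (9), i.e.

$$N^{(n)} = \sum_{k=1}^{2^d} \sum_{i=1}^{n} \delta_{a_{n,k}^{-1}[Y_i - u]}\, I[Y_i \in O'_k],$$

where $O'_k = O_k + u$. More generally, if $u_j \in \mathbb{R}^d$, $j = 1, \ldots, m$, then we may
define

$$N^{(n)}(j) := N^{(n)}(u_j) = \sum_{k=1}^{2^d} \sum_{i=1}^{n} \delta_{a_{n,k,j}^{-1}[Y_i - u_j]}\, I[Y_i \in O'_{k,j}], \tag{12}$$

where $O'_{k,j} = O_k + u_j$.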

Theorem 3.4 Assume that the orthant-wise regular variation conditions (11)
are satisfied at $u_j$, $j = 1, \ldots, m$, $u_j \in \mathbb{R}^d$. For each $j$, let $N^{(n)}(j)$ denote the
$\mathbb{R}^d$-indexed point process of (12). Then

$$(N^{(n)}(1), N^{(n)}(2), \ldots, N^{(n)}(m)) \to_D (N(1), N(2), \ldots, N(m))$$

where $(N(1), N(2), \ldots, N(m))$ is a vector of independent Poisson processes,
where the $j$th component process is parameterized by $\mathbb{R}^d$ and its mean measure
is given orthant-wise by the regular variation rates of $F$ at the corresponding
$u_j$.

Examples of regularly varying distributions are readily constructed. One
source of examples is distributions based on copulas as described, for example,
in Nelsen (1999) and Section 4.2 below.



4 Applications 

Remark 4.1 The values of $a_n$ depend on the exact asymptotic behaviour of the
density at $x$, and certainly will not be known in general. In our applications we
consider only the special cases where the slowly varying function $\ell(n)$ is in fact
a constant, although unknown. We then apply Theorem 3.1 with the scaling
values equal to $n^{-1/\alpha}$. We can define a compensator, $\tilde\Lambda^{(n)}$, by (6) where $a_n$ is
replaced by $n^{-1/\alpha}$, and the relation to the original definition is given by

$$\tilde\Lambda_t^{(n)} = \Lambda^{(n)}_{(n^{-1/\alpha}/a_n)\,t}.$$

Since $\lim_{n \to \infty} (n^{-1/\alpha}/a_n) = w^{1/\alpha}$, then

$$\tilde\Lambda_t^{(n)} = \Lambda^{(n)}_{(n^{-1/\alpha}/a_n)\,t} \to \Lambda_{w^{1/\alpha} t} = w t^\alpha$$

and we have convergence of the empirical point process to a Poisson process
with intensity $w \alpha t^{\alpha - 1}$. For example, if the density of $Y$ at 0 is $\theta$, then the
weak limit of $\tilde N^{(n)} = \sum_{i=1}^{n} \delta_{n^{1/\alpha} Y_i}$ is a Poisson process with intensity $\theta \alpha t^{\alpha - 1}$.
The corresponding changes to the other theorems of the previous section are
immediate.



4.1 Local Density Estimation 

Consider a sample $\{Y_1, Y_2, \ldots, Y_n\}$ with common marginal differentiable distribution
$F$ on $[0,1]$, and assume that its density $f$ is positive on the range $[0,1]$.
Let $F_n$ denote the empirical distribution and define $[t]^+ = Y_{(k+1)}$ and $[t]^- = Y_{(k)}$
by $Y_{(k)} \le t < Y_{(k+1)}$. We put $[t]^- = 0$ if $t < Y_{(1)}$ and $[t]^+ = 1$ if $t > Y_{(n)}$.
A naive nearest-neighbour estimator of the density at $t$ is given by

$$\hat f(n,t) = \frac{1/n}{([t]^+ - t) + (t - [t]^-)}. \tag{13}$$

Additional information on nearest-neighbour density estimates can be found in 
Hardle (1990) or Silverman (1992), including comments on performance, and 
modifications. 

For $t \in (0,1)$ the fact that $F$ is differentiable and that $f$ is positive (i.e. $F$
is of regular variation index $\alpha = 1$ at $t$) allows us to write

$$\frac{\hat f(n,t)}{f(t)} = \frac{1/f(t)}{n([t]^+ - t) + n(t - [t]^-)} \to_D \frac{1/f(t)}{E_1 + E_2} \tag{14}$$

where $E_1$ and $E_2$ are independent exponential variables of mean $1/f(t)$. This
convergence follows from Corollary 3.2 and the continuous mapping theorem,
and follows the pattern set for extreme value processes as given in Resnick
(1987). As each limiting Poisson process has a constant rate function equal to
$f(t)$, the distance from $t$ to the first point has an exponential distribution with
mean $1/f(t)$. Since such an exponential variable can be written as the product
of $1/f(t)$ and an exponential of mean 1, and the sum of two independent mean
1 exponentials is a $\Gamma(2,1)$ variable, we have identified the limiting distribution
of $\hat f(n,t)/f(t)$ as Inverse Gamma, $\Gamma^{(-1)}(2,1)$. The mode, mean and variance of
an Inverse Gamma density of parameters $(\alpha, \beta)$ are $\beta/(\alpha + 1)$, $\beta/(\alpha - 1)$ (for
$\alpha > 1$) and $\beta^2/((\alpha - 1)^2(\alpha - 2))$ (for $\alpha > 2$), respectively. Thus we see that this
naive estimator of $f(t)$ has mode $f(t)/3$, mean $f(t)$ and infinite variance.
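
A minimal sketch of the naive estimator (13), offered as an illustration rather than as the authors' implementation. The function name is hypothetical, and raising an error when $t$ falls outside the sample range is a choice made here, not a convention taken from the text.

```python
import bisect


def naive_nn_density(sample, t):
    """Naive nearest-neighbour density estimate (1/n) / ([t]^+ - [t]^-)."""
    ys = sorted(sample)
    n = len(ys)
    k = bisect.bisect_right(ys, t)      # number of observations <= t
    if k == 0 or k == n:
        # Assumption of this sketch: no boundary convention is applied.
        raise ValueError("t must lie inside the range of the sample")
    lower, upper = ys[k - 1], ys[k]     # Y_(k) <= t < Y_(k+1)
    return (1.0 / n) / (upper - lower)


if __name__ == "__main__":
    print(naive_nn_density([0.1, 0.2, 0.4, 0.8], 0.3))
```

Only the spacing that straddles $t$ enters the estimate, which is why its limit is driven by just two exponential variables.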

This development can be easily extended to estimators based on the $k$ lower
nearest neighbours and $k$ upper nearest neighbours. As above, asymptotically
the spacings between consecutive neighbours are independent exponential variables
with mean $1/f(t)$. The asymptotic joint density is the product of $2k$
exponentials, and the sufficient statistic is just the total distance from the lower
$k$th-nearest neighbour of $t$, $[t]^{-k}$, to the upper $k$th-nearest neighbour, $[t]^{+k}$.

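Corollary 4.2 The asymptotically uniformly minimum variance unbiased estimator
($k > 1$) is

$$\hat f_k(n,t) = \frac{(2k-1)/n}{[t]^{+k} - [t]^{-k}},$$

and $\hat f_k(n,t)/f(t)$ has an asymptotic scaled Inverse Gamma density: it is distributed
as $(2k-1)$ times a $\Gamma^{(-1)}(2k,1)$ variable.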
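
The $k$-nearest-neighbour version can be sketched along the same lines. This is an illustration only; the function name is hypothetical and the error raised when fewer than $k$ neighbours exist on either side is an assumption of the sketch.

```python
import bisect


def knn_density(sample, t, k):
    """k-nearest-neighbour estimate ((2k - 1) / n) / ([t]^{+k} - [t]^{-k})."""
    if k < 1:
        raise ValueError("k must be at least 1")
    ys = sorted(sample)
    n = len(ys)
    j = bisect.bisect_right(ys, t)      # number of observations <= t
    if j < k or j + k > n:
        raise ValueError("fewer than k neighbours on one side of t")
    lower = ys[j - k]                   # k-th nearest neighbour below t
    upper = ys[j + k - 1]               # k-th nearest neighbour above t
    return ((2 * k - 1) / n) / (upper - lower)


if __name__ == "__main__":
    data = [0.1, 0.2, 0.3, 0.5, 0.6, 0.9]
    print(knn_density(data, 0.4, 1), knn_density(data, 0.4, 2))
```

With `k=1` the function reduces to the naive estimator based on the single spacing around $t$.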

Using this result we can consequently compute approximate confidence intervals
for $f(t)$ or construct tests. If $k$ is fixed, Theorem 3.3 also identifies the
limiting distribution of

$$(\hat f_k(n,t_1), \hat f_k(n,t_2), \ldots, \hat f_k(n,t_m))$$

as given by a vector of $m$ independent scaled inverse Gamma variables. Consequently
we can obtain the limiting distribution of expressions such as approximate
integrals,

$$\sum_{i=1}^{m} r(t_i)\, \hat f_k(n,t_i),$$

even for arbitrary dimension (Theorem 3.4) with appropriate norming.

Remark 4.3 On the other hand, we see that $\hat f_k(n,t)/f(t)$ still has a (scaled) Inverse
Gamma distribution, but with finite variance for $k > 1$. It has asymptotic
second moment $1 + 1/(2k-2)$, that is, asymptotic variance $1/(2k-2)$, and so remains
inherently random regardless of the
fixed number of nearest neighbours used in the estimate. Nearest-neighbour
methods have become popular in data mining, classification and computing
applications, and rapid algorithms exist for finding the $k$ nearest neighbours
to a point $t$ even in high dimensions. The above discussion shows that even
in highly regular cases, the best $k$-nearest-neighbour density estimate will not
converge in probability to the desired limit, and remains random.
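
The persistence of randomness for fixed $k$ can be seen by sampling the limiting ratio directly. The sketch below is an illustration, with a hypothetical function name: it draws $(2k-1)/G$ with $G \sim \Gamma(2k,1)$, the limit of $\hat f_k(n,t)/f(t)$ described above. The sample mean should be near 1 and the sample variance near $1/(2k-2)$, equivalently a second moment near $1 + 1/(2k-2)$.

```python
import random


def limit_ratio_moments(k, reps, seed=0):
    """Sample mean and variance of (2k - 1) / G with G ~ Gamma(2k, 1)."""
    if k < 2:
        raise ValueError("the variance is finite only for k >= 2")
    rng = random.Random(seed)
    # random.gammavariate(shape, scale): shape 2k and unit scale.
    draws = [(2 * k - 1) / rng.gammavariate(2 * k, 1.0) for _ in range(reps)]
    mean = sum(draws) / reps
    var = sum((x - mean) ** 2 for x in draws) / (reps - 1)
    return mean, var


if __name__ == "__main__":
    # For k = 3 the limiting variance is 1 / (2k - 2) = 0.25.
    print(limit_ratio_moments(k=3, reps=20000))
```

The variance shrinks only as $k$ grows, so no fixed $k$ yields a consistent estimate.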
As an example of a test that can be constructed using the results of this
paper, we consider the null hypothesis that $F$ is regularly varying as $\omega t$ from
the right at 0 for some $\omega > 0$ (e.g. $F'(0) = \omega > 0$). We take the alternative to
be where $F$ varies as $\xi t^2$ from the right for some $\xi > 0$ (i.e. $F'(0) = 0$). The
maximum likelihood under the null hypothesis is proportional to $([t]^{+k})^{-k}$, and
that under the alternative proportional to $([t]^{+k})^{-k} \times \prod_{i=1}^{k} ([t]^{+i}/[t]^{+k})$.

Corollary 4.4 The likelihood ratio test based on the $k \ge 2$ upper nearest neighbours
rejects when

$$\prod_{i=1}^{k-1} \left([t]^{+i} / [t]^{+k}\right)$$

is too large. Under the null hypothesis, the distribution of this product is given
by the product of $k - 1$ independent uniform variables on $[0,1]$.

When $k = 2$ we obtain an intuitively reasonable test that rejects when the
distance from 0 to $[t]^{+1}$ is much larger than that from $[t]^{+1}$ to $[t]^{+2}$, and so
indicates the presence of a "gap" in the distribution.
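
The test statistic is simple to compute. The sketch below is illustrative and takes the point of interest to be $t = 0$ with positive data, so that the $i$th upper nearest neighbour is the $i$th smallest observation; the function name is hypothetical.

```python
import math


def gap_statistic(sample, k):
    """Product over i < k of (i-th smallest observation) / (k-th smallest)."""
    if k < 2:
        raise ValueError("k must be at least 2")
    ys = sorted(sample)[:k]
    if len(ys) < k:
        raise ValueError("need at least k observations")
    if ys[0] <= 0:
        raise ValueError("observations must be positive")
    kth = ys[k - 1]
    return math.prod(ys[i] / kth for i in range(k - 1))


if __name__ == "__main__":
    print(gap_statistic([0.4, 0.1, 0.2, 0.9], 3))
```

Under the null hypothesis the statistic behaves like a product of $k-1$ independent uniforms, so its mean is $2^{-(k-1)}$; values close to 1 mean the first $k-1$ points crowd against the $k$th, leaving a gap near 0.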

4.2 Multivariate extremes 

Let $\{(Y_{n1}, Y_{n2})\}_{n \ge 1}$ be an i.i.d. sequence of bivariate random vectors. To focus
on the bivariate dependence structure rather than the marginal distributions,
we assume that $(Y_{11}, Y_{12})$ has a copula $C$ and standard uniform marginals, see
Nelsen (1999). We want to characterize

$$P(Y_{11} > 1 - x t_1,\ Y_{12} > 1 - x t_2)$$

as $x \downarrow 0$. If $Y_{11}$ and $Y_{12}$ are independent, then the above probability factors
and we can apply standard extreme value methods (e.g. Resnick, 1987) to the
marginals. However, if $Y_{11}$ and $Y_{12}$ are dependent but the maxima are asymptotically
independent then the extreme value methods fail; see Fougères (2004)
for a general discussion of this problem. For most known families of copulas
which have the asymptotic independence property, we have (cf. Heffernan,
2000)

$$P(Y_{11} > 1 - x t_1,\ Y_{12} > 1 - x t_2) \sim c x^2. \tag{15}$$

By the results of this paper the appropriate scaling to obtain a point process
limit for the joint extremes is $a_n = n^{-1/2}$, and not the $a_n = n^{-1}$ that would be
used to normalize the marginal variables individually. Note, moreover, that the
methods of this paper are "dimension free", and so we can address multivariate
copulas of any dimension.
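
The role of the scaling $a_n = n^{-1/2}$ can be illustrated in the simplest case. The sketch below assumes independent uniform coordinates (an assumption made here for illustration, for which the joint probability is exactly $x^2 t_1 t_2$) and counts the points falling in the corner $(1 - a_n t_1, 1] \times (1 - a_n t_2, 1]$. Function names are hypothetical.

```python
import random


def corner_count(n, t1, t2, rng):
    """Points of an independent-uniform sample in the upper corner scaled by n**-0.5."""
    a_n = n ** -0.5
    count = 0
    for _ in range(n):
        y1, y2 = rng.random(), rng.random()   # independent uniform coordinates
        if y1 > 1.0 - a_n * t1 and y2 > 1.0 - a_n * t2:
            count += 1
    return count


def mean_corner_count(n, t1, t2, reps, seed=0):
    """Monte Carlo mean of the corner count; the expected value is t1 * t2."""
    rng = random.Random(seed)
    return sum(corner_count(n, t1, t2, rng) for _ in range(reps)) / reps


if __name__ == "__main__":
    print(mean_corner_count(n=400, t1=2.0, t2=1.5, reps=1000))
```

With the marginal scaling $n^{-1}$ instead, the expected count would be $t_1 t_2 / n$ and the corner would be empty in the limit.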

Further we can address the joint extreme value behaviour of copulas with
the asymptotic independence property but where (15) is not satisfied. Consider
the case when $C$ is the bivariate normal copula with correlation $\rho \in (0,1]$,
i.e. $C(x,y)$ is given by a joint normal distribution function at $(\Phi^{-1}(x), \Phi^{-1}(y))$
with standard marginals and correlation $\rho$. We have

$$P(Y_{11} > 1 - x t_1,\ Y_{12} > 1 - x t_2) \sim x^{2/(1+\rho)}\, g(t_1, t_2)$$

for a function $g$ as $x \downarrow 0$, and so

$$\frac{P(Y_{11} > 1 - x t_1,\ Y_{12} > 1 - x t_2)}{P(Y_{11} > 1 - x,\ Y_{12} > 1 - x)} \to \frac{g(t_1, t_2)}{g(1,1)},$$

for $t_1, t_2 > 0$. For $u = (1,1)$ and $t_1, t_2 < 0$, formula (11) is satisfied with

$$W(t_1, t_2) = \frac{g(-t_1, -t_2)}{g(1,1)}.$$



Applying the results of Section 3.2 we can characterize the asymptotics of joint
extremes for a normal copula.

Corollary 4.5 Assume that $\{Y_n = (Y_{n1}, Y_{n2})\}_{n \ge 1}$ are independent, have a
common normal copula of parameter $\rho$ and uniform marginals. Then for $u =
(1,1)$ and $a_n = n^{-(1+\rho)/2}$,

$$\sum_{i=1}^{n} \delta_{a_n^{-1}[Y_i - u]}$$

converges to a Poisson process on $\mathbb{R}^2$, with mean measure $W(\cdot, \cdot)$.
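
As a small worked check of the scaling in Corollary 4.5 (an illustration with a hypothetical function name): $a_n = n^{-(1+\rho)/2}$ is the sequence for which $n\, a_n^{2/(1+\rho)} = 1$, matching the rate $x^{2/(1+\rho)}$ of the joint tail probability.

```python
def joint_extreme_scale(n, rho):
    """Scaling a_n = n ** (-(1 + rho) / 2) for the normal copula with parameter rho."""
    if n < 1:
        raise ValueError("n must be a positive sample size")
    if not 0.0 < rho <= 1.0:
        raise ValueError("rho must lie in (0, 1]")
    return n ** (-(1.0 + rho) / 2.0)


if __name__ == "__main__":
    n, rho = 10_000, 0.5
    a_n = joint_extreme_scale(n, rho)
    # n * a_n ** (2 / (1 + rho)) should equal 1 up to rounding.
    print(a_n, n * a_n ** (2.0 / (1.0 + rho)))
```

For $\rho$ strictly between 0 and 1 this scaling lies between the marginal rate $n^{-1}$ and the rate $n^{-1/2}$ appropriate under (15).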